Abstract

This paper describes a novel method of compiling
ranked tagging rules into a deterministic
finite-state device called a bimachine.
The rules are formulated in the framework 
of regular rewrite operations and allow un- 
restricted regular expressions in both left 
and right rule contexts. The compiler is il- 
lustrated by an application within a speech 
synthesis system. 

1 Motivation 

In rule-based tagging, linguistic objects (e.g.
phonemes, syllables, or words) are assigned
linguistically meaningful labels based on the
context. Each instance of label assignment is
licensed by a tagging rule typically specifying
that label $\psi$ can be assigned to item $\phi$ if
$\phi$ is preceded by a pattern $\lambda$ and followed
by a pattern $\rho$. The patterns $\lambda$ and $\rho$ are
usually formulated as regular expressions over the
input alphabet, but may also range over output labels.

The nature of the tagging task suggests a
formalisation in terms of finite-state transducers
(FSTs). More precisely, the task can be viewed
as an instance of string rewriting. In this
framework, a tagging rule is interpreted as a regular
rewrite rule $\phi \to \psi / \lambda \_ \rho$.

Several methods have been proposed for the
compilation of such rules into FSTs (Kaplan
and Kay, 1994; Mohri and Sproat, 1996;
Gerdemann and van Noord, 1999). A rewrite rule is
converted into a number of transducers, which
are combined by means of transducer composition,
yielding an FST that implements the actual
rewrite operation.

Typically, tagging is carried out by a set of
rules $R_i : \phi_i \to \psi_i / \lambda_i \_ \rho_i$, $i = 1 \dots n$,
which may overlap and/or conflict. A regular rule compiler
should not just convert the rules into separate
transducers $T_1, \dots, T_n$. For efficiency reasons, it
is highly desirable to convert them into a single
machine in a way that determines how rule
conflicts should be resolved.

There are two basic options as to how the rule 
transducers can be combined. 

• The rules can be associated with numerical
costs, which translate as transition weights
in the compilation step. In this formalisation,
the union $T_1 \cup \dots \cup T_n$ is a weighted
finite-state transducer (WFST). This transducer
is typically non-deterministic, but
the weights make it possible to find the optimal
path efficiently (as an instance of the
single-source shortest paths problem).

• The rules can be explicitly ranked. In such
a case, priority union (Karttunen, 1998), or
an equivalent operator, can be used to combine
$T_1, \dots, T_n$ into a single unambiguous
FST which is then turned into a deterministic
device (Skut et al., 2004).

The work reported here pursues the latter strategy.
Although less powerful and flexible than
e.g. probabilistic approaches, it has the advantage
of efficiency: once the rules have been compiled,
rewriting an input sequence of $t$ symbols
boils down to $t$ lookups in a transition table ($2t$
in case of a bimachine, see below).¹

Several compilation methods have been proposed
for creating a deterministic machine out
of a set of rules (Laporte, 1997; Roche and
Schabes, 1995; Hetherington, 2001). However,
most of them impose strong restrictions on the
form of contextual constraints: $\lambda$ and $\rho$ are
restricted to single symbols (Hetherington, 2001),
or acyclic regular expressions (Laporte, 1997).

Skut et al. (2004) describe a more powerful
rewrite rule compiler that does not impose such
constraints on $\phi$, $\lambda$ and $\rho$. Each rule $R_i$ is
compiled into an unambiguous FST $T_i$ that inserts
a marker at the beginning of every match of
$\phi_i$ preceded by an instance of $\lambda_i$ and followed
by an instance of $\rho_i$. While $\rho_i$ and $\lambda_i$ may
contain markers inserted by other rules, $\phi_i$ must
be marker-free. The composition $T_1 \circ \dots \circ T_n$
of the rule transducers yields an FST that inserts
rule markers into the input string, resolving
rule conflicts according to the explicit ranking
of rules. Composed with a transducer that
performs the actual rewrite operation, it produces
an unambiguous FST which implements
the required combination of rules.

¹With hand-written rules, the simple ranking is actually
an advantage since a more complex rule interaction
typically affects the transparency of the system.

Two problems arise with this approach. 

• Although the resulting FST is unambigu- 
ous (i.e., implements a function), it may 
be non-determinisable (Mohri, 1997a; La- 
porte, 1997). 

• The composition operation used to com- 
bine the ranked rule transducers quickly 
creates large non-deterministic FSTs, re- 
sulting in slow compilation and high mem- 
ory requirements. 

The remedy to the first problem is straightfor- 
ward: since the resulting FST implements a 
function, it can be compiled into a bimachine, 
i.e. an aggregate of a left-to-right and a right-to- 
left deterministic finite-state automaton (FSA) 
associated with an output function. The appli- 
cation of such a bimachine to a string involves 
running both automata (in the respective direc- 
tions) and determining the symbols emitted by 
the output function (cf. section 2.1). 

The simplest option is thus first to create
the rule transducers, then compose them into
a (non-deterministic) FST, and finally apply a
bimachine construction method (Roche and Schabes, 1996).
However, such a solution will not eliminate the
inefficiency caused by expensive rule composition.

Thus, we have developed a compilation
method that constructs the left-to-right and the
right-to-left automaton of the resulting bimachine
directly from the patterns without having
to construct and then to compose the rule
transducers. The efficiency of the compiler is
increased by employing finite-state automata
instead of FSTs, since algorithms used to process
FSAs are typically faster than the corresponding
transducer algorithms. Furthermore, the
resulting (intermediate) structures are significantly
smaller than in the case of FSTs. This
leads to much faster compilation and smaller
finite-state machines.



2 Formalization 

2.1 Definitions and Notation 

In the following definitions, $\Sigma$ denotes a finite
input alphabet. $\Delta$ is a finite output alphabet.

A deterministic finite-state automaton
(DFSA) is a quintuple $A = (\Sigma, Q, q_0, \delta, F)$ such
that:

$Q$ is a finite set of states;

$q_0 \in Q$ is the initial state of $A$;

$\delta : Q \times \Sigma \to Q$ is the transition function of $A$;

$F \subseteq Q$ is a non-empty set of final states.

A sequential transducer (ST) is defined as
a 7-tuple $T = (\Sigma, \Delta, Q, q_0, \delta, \sigma, F)$ such that
$(\Sigma, Q, q_0, \delta, F)$ is a DFSA, and $\sigma(q, a)$ is the
output associated with the transition from state $q$
via symbol $a$ to state $\delta(q, a)$.

The functions $\delta$ and $\sigma$ can be extended to
the domain $Q \times \Sigma^*$ by the recursive definition:
$\delta^*(q, \epsilon) = q$, $\delta^*(q, wa) = \delta(\delta^*(q, w), a)$,
$w \in \Sigma^*$, $a \in \Sigma$, and $\sigma^*(q, \epsilon) = \epsilon$,
$\sigma^*(q, wa) = \sigma^*(q, w)\,\sigma(\delta^*(q, w), a)$.
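The recursive definitions of the extended transition and output functions translate directly into a loop over the input. The following is a minimal sketch, assuming that states and symbols are plain strings and that the transition and output functions are stored in dictionaries; the function name `run_sequential_transducer` and the toy transducer are illustrative assumptions, not part of the compiler described here.

```python
from typing import Dict, Tuple

State = str
Symbol = str


def run_sequential_transducer(
    delta: Dict[Tuple[State, Symbol], State],
    sigma: Dict[Tuple[State, Symbol], str],
    q0: State,
    w: str,
) -> Tuple[State, str]:
    """Return the pair (delta*(q0, w), sigma*(q0, w))."""
    q, out = q0, ""
    for a in w:
        # sigma*(q0, wa) = sigma*(q0, w) followed by sigma(delta*(q0, w), a)
        out += sigma[(q, a)]
        # delta*(q0, wa) = delta(delta*(q0, w), a)
        q = delta[(q, a)]
    return q, out


# Hypothetical toy transducer over {a, b}: an "a" directly after a "b"
# is emitted in upper case, everything else is copied.
DELTA = {("s", "a"): "s", ("s", "b"): "t", ("t", "a"): "s", ("t", "b"): "t"}
SIGMA = {("s", "a"): "a", ("s", "b"): "b", ("t", "a"): "A", ("t", "b"): "b"}

final_state, output = run_sequential_transducer(DELTA, SIGMA, "s", "abab")
```

Each input symbol costs one lookup in each table, which is the source of the linear-time behaviour of sequential devices.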

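A bimachine $B$ is a triple $(\overrightarrow{A}, \overleftarrow{A}, h)$ such that:

$\overrightarrow{A} = (\Sigma, \overrightarrow{Q}, \overrightarrow{q_0}, \overrightarrow{\delta})$ is a left-to-right FSA (there
is no concept of final states in a bimachine);

$\overleftarrow{A} = (\Sigma, \overleftarrow{Q}, \overleftarrow{q_0}, \overleftarrow{\delta})$ is a right-to-left FSA;

$h : \overrightarrow{Q} \times \Sigma \times \overleftarrow{Q} \to \Delta^*$ is the output function.

Applied to a string $u = a_1 \dots a_t$, $B$ produces a
string $v = b_1 \cdot \ldots \cdot b_t$, such that $b_i \in \Delta^*$ is defined
as follows:

$b_i = h(\overrightarrow{\delta}^*(\overrightarrow{q_0}, a_1 \dots a_{i-1}),\ a_i,\ \overleftarrow{\delta}^*(\overleftarrow{q_0}, a_t \dots a_{i+1}))$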

An unambiguous finite-state transducer can al- 
ways be converted into a bimachine (Berstel, 
1979; Roche and Schabes, 1996). This property 
makes bimachines an attractive tool for deter- 
ministic processing, especially since not all un- 
ambiguous transducers are determinisable (se- 
quentiable). 
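The application of a bimachine to a string can be sketched as two passes over the input: one right-to-left pass that records the states of the right-to-left automaton, and one left-to-right pass that calls the output function. The code below is an illustrative sketch only; the dictionary encoding of the automata, the function name `apply_bimachine` and the toy output function are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Transitions = Dict[Tuple[str, str], str]


def apply_bimachine(
    fwd: Transitions, fwd_q0: str,
    bwd: Transitions, bwd_q0: str,
    h: Callable[[str, str, str], str],
    u: str,
) -> str:
    """Apply a bimachine (left-to-right FSA, right-to-left FSA, h) to u."""
    t = len(u)
    # right[i] is the state of the right-to-left automaton after it has
    # consumed everything to the right of position i (0-based).
    right: List[str] = [bwd_q0] * t
    for i in range(t - 1, 0, -1):
        right[i - 1] = bwd[(right[i], u[i])]
    out = []
    q = fwd_q0  # state after consuming everything to the left of position i
    for i, a in enumerate(u):
        out.append(h(q, a, right[i]))
        q = fwd[(q, a)]
    return "".join(out)


# Hypothetical toy bimachine over {a, b}: both automata simply remember the
# last symbol they have read ("#" means nothing read yet).
LAST = {(q, a): a for q in "#ab" for a in "ab"}


def h(left: str, a: str, right: str) -> str:
    # Upper-case an "a" that has a "b" on both sides, copy everything else.
    return "A" if (left, a, right) == ("b", "a", "b") else a


result = apply_bimachine(LAST, "#", LAST, "#", h, "abababa")
```

For an input of length t, the sketch performs one transition lookup per symbol in each direction, in line with the 2t lookups mentioned in section 1.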

2.2 Rules, Rule Ordering and Priorities 

As mentioned in section 1, the tagging rule for- 
malism is formulated in the framework of regu- 
lar rewrite rules (Kaplan and Kay, 1994). Input 
to the rule compiler consists of a set of rules. 

$R_i : \phi_i \to \psi_i / \lambda_i \_ \rho_i$



A rule $\phi \to \psi / \lambda \_ \rho$ states that label $\psi$ is
assigned to object $\phi$ (called the focus of the rule)
if $\phi$ is preceded by a left context $\lambda$ and followed
by a right context $\rho$. The context descriptions
$\lambda$ and $\rho$ are formulated as regular expressions
over the input alphabet $\Sigma$.

The rules may conflict, in which case the
ambiguity is resolved based on the order of the
rules in the grammar. If a rule $R_i$ fires for an
object $s_k$ in the input sequence $s_1 \dots s_t$, it blocks
the application of all rules $R_j$, $j > i$, to $s_k$.

Thus the operational semantics of the rules 
may be stated as follows: A rule Ri fires if: 

(a) no other rule Rj, j < i, is applicable in the 
same context; 

(b) the substring $s_1 \dots s_{k-1}$ matches the regular
expression $\Sigma^*\lambda_i$;

(c) the substring $s_k \dots s_t$ matches the regular
expression $\phi_i\rho_i\Sigma^*$.
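Conditions (a)-(c) can be checked naively, one rule and one position at a time, which gives a useful reference against which a compiled machine can be tested. The sketch below assumes that input symbols are single characters and that contexts are written as Python regular expressions; the `Rule` tuple layout, the function names and the toy rules are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# (focus symbol, label, left-context regex, right-context regex)
Rule = Tuple[str, str, str, str]


def firing_rule(rules: List[Rule], s: str, k: int) -> Optional[int]:
    """Return the 1-based index of the rule that fires at position k (1-based)."""
    prefix, rest = s[: k - 1], s[k - 1:]
    for i, (phi, _psi, lam, rho) in enumerate(rules, start=1):
        # (b): s_1 ... s_{k-1} matches Sigma* lambda_i
        left_ok = re.fullmatch(".*(?:" + lam + ")", prefix, flags=re.DOTALL)
        # (c): s_k ... s_t matches phi_i rho_i Sigma*
        right_ok = re.fullmatch(
            re.escape(phi) + "(?:" + rho + ").*", rest, flags=re.DOTALL
        )
        if left_ok and right_ok:
            return i  # (a): no rule with a smaller index matched
    return None


def tag(rules: List[Rule], s: str) -> str:
    """Rewrite s, copying symbols for which no rule fires."""
    out = []
    for k in range(1, len(s) + 1):
        i = firing_rule(rules, s, k)
        out.append(s[k - 1] if i is None else rules[i - 1][1])
    return "".join(out)


# Hypothetical rules: a -> X / b _ c  and  a -> Y / _ c+
RULES: List[Rule] = [("a", "X", "b", "c"), ("a", "Y", "", "c+")]
tagged = tag(RULES, "bacac")
```

This check re-scans the string for every rule and every position, which is the cost that the automaton-based construction is designed to avoid.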

This basic formalism imposes two conditions on 
the rules: 

• $\lambda$ and $\rho$ are regular expressions over $\Sigma$;

• $\phi \in \Sigma$ is a single object.

These two assumptions restrict the expressive 
power of the formalism compared to general reg- 
ular rewrite rules (Kaplan and Kay, 1994; Mohri 
and Sproat, 1996) in that they do not allow out- 
put symbols on either side of the context and 
only admit rule foci of length one. Although 
these restrictions are essential for the initial ba- 
sic formalism, we show in section 3 how to ex- 
tend it so that the compiler can accept rules 
with both input and output symbols in the left 
context. As for the length of the focus, it is im- 
portant to bear in mind that the formalism is 
primarily intended for tagging rules which usu- 
ally do not cover longer foci. 

2.3 Matching of Context Patterns 

The basic idea in the new compilation method is
to convert the patterns $\lambda_i$ and $\phi_i\rho_i$, $i = 1, \dots, n$,
directly into the left-to-right and right-to-left
acceptor of a bimachine without having to perform
the fairly expensive operations required by
the transducer-based approaches.

Key to the solution is the function
$\mathrm{SimultMatch}_{\beta_1 \dots \beta_n} : \Sigma^* \to (2^I)^{t+1}$, $t \in \mathbb{N}$,
which, given a collection $\beta_1, \dots, \beta_n$ of regular
expressions, maps a sequence of symbols
$s_1 \dots s_t$ to a sequence of $t + 1$ sets of indices
corresponding to the matching patterns at the
respective position (position 0 corresponds to
the beginning of the string, $I$ denotes the set
$\{1, \dots, n\}$ of rule indices):

$\mathrm{SimultMatch}_{\beta_1 \dots \beta_n}(s_1 \dots s_t)[k] = \{ j \in I : \Sigma^*\beta_j \text{ matches } s_1 \dots s_k \}$

This construct can be implemented as a pair
$(A_{\beta_1 \dots \beta_n}, \tau)$ such that:

$A_{\beta_1 \dots \beta_n} = (\Sigma, Q, q_0, \delta)$ is a finite-state
automaton that encodes in its states ($Q$)
information about the matching patterns.

$\tau : Q \to 2^I$ is a function mapping the states of
$A_{\beta_1 \dots \beta_n}$ to sets of indices corresponding to
matching regular expressions:
$\tau(\delta^*(q_0, w)) = \{ j \in I : \Sigma^*\beta_j \text{ matches } w \}$.

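In order to construct $\mathrm{SimultMatch}_{\beta_1 \dots \beta_n}$, we
introduce a marker symbol $\$_j \notin \Sigma$ for each $\beta_j$.
Let $\Sigma_\$ = \Sigma \cup \bigcup_{j \in I} \{\$_j\}$ be the extended
alphabet. Let $A^\$ = (\Sigma_\$, Q^\$, q_0^\$, \delta^\$, F^\$)$ be a
deterministic finite acceptor for the regular expression
$\Sigma^* \bigcup_{j \in I} \beta_j \$_j$.

An important property of the automaton $A^\$$
is that $w \in \Sigma^*$ is an instance of a pattern $\Sigma^*\beta_j$
if and only if $q = \delta^{\$*}(q_0^\$, w)$ is defined and there
exists a transition from $q$ by $\$_j$ to a final state:
$\delta^\$(\delta^{\$*}(q_0^\$, w), \$_j) \in F^\$$. Now we can define the
function $\tau : Q^\$ \to 2^I$:

$\tau(q) = \{ j \in I : (q, \$_j) \in \mathrm{Dom}(\delta^\$) \wedge \delta^\$(q, \$_j) \in F^\$ \}$

Obviously, if $A^\$$ enters state $q$ after consuming
a string $w \in \Sigma^*$, $\tau(q)$ is the set of all indices $j$
such that $w$ matches $\Sigma^*\beta_j$.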

The automaton $A_{\beta_1 \dots \beta_n} = (\Sigma, Q, q_0, \delta, Q)$ can
now be constructed from $A^\$$ by restricting it to
the alphabet $\Sigma$ (which includes trimming away
the unreachable states) and making all its states
final so that it accepts all strings $w \in \Sigma^*$:

$Q = \{ q \in Q^\$ : \exists w \in \Sigma^* \ \ \delta^{\$*}(q_0^\$, w) = q \}$

$q_0 = q_0^\$$

The resulting construct $(A_{\beta_1 \dots \beta_n}, \tau)$ makes it
possible to simultaneously match a collection of
regular expressions.
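The last two steps of the construction (reading the index sets off the marker transitions, then restricting the acceptor to the input alphabet) can be illustrated on a small hand-built automaton. The sketch below assumes that the deterministic acceptor over the marker-extended alphabet is already available as a dictionary; building that acceptor from regular expressions is not shown. The function names and the example automaton (for the two patterns "a" and "ab") are assumptions for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Transitions = Dict[Tuple[str, str], str]


def build_simult_match(
    delta_m: Transitions,        # acceptor over Sigma plus marker symbols
    q0: str,
    finals: Set[str],
    sigma: Iterable[str],        # the input alphabet Sigma
    markers: Dict[str, int],     # marker symbol -> pattern index j
) -> Tuple[Transitions, Dict[str, FrozenSet[int]]]:
    """Return (delta restricted to Sigma, tau) for the reachable states."""
    sigma = list(sigma)
    reachable, stack = {q0}, [q0]
    while stack:                 # states reachable via Sigma-symbols only
        q = stack.pop()
        for a in sigma:
            nxt = delta_m.get((q, a))
            if nxt is not None and nxt not in reachable:
                reachable.add(nxt)
                stack.append(nxt)
    delta = {(q, a): r for (q, a), r in delta_m.items()
             if q in reachable and a in sigma}
    # tau(q): indices j whose marker leads from q to a final state
    tau = {q: frozenset(j for m, j in markers.items()
                        if delta_m.get((q, m)) in finals)
           for q in reachable}
    return delta, tau


def simult_match(delta: Transitions, tau, q0: str, s: str) -> List[FrozenSet[int]]:
    """Return the t+1 index sets for the prefixes of s (position 0 first)."""
    q, sets = q0, [tau[q0]]
    for a in s:
        q = delta[(q, a)]
        sets.append(tau[q])
    return sets


# Hand-built acceptor for Sigma* (a $1 | ab $2) over Sigma = {a, b}.
DELTA_M = {
    ("q0", "a"): "qa", ("q0", "b"): "q0",
    ("qa", "a"): "qa", ("qa", "b"): "qab",
    ("qab", "a"): "qa", ("qab", "b"): "q0",
    ("qa", "$1"): "f", ("qab", "$2"): "f",
}
delta, tau = build_simult_match(DELTA_M, "q0", {"f"}, "ab", {"$1": 1, "$2": 2})
sets = simult_match(delta, tau, "q0", "aab")
```

The final state `f` is only reachable through a marker, so it is trimmed away, and every remaining state carries the set of patterns that match at that point.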

2.4 Bimachine Compilation 

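Using the construct SimultMatch, we can determine
all the matching left and right contexts
at any position $k$ in a string $w = a_1 \dots a_t$.
The value of $\mathrm{SimultMatch}_{\lambda_1 \dots \lambda_n}(w)[k-1]$
is the set of all rule indices $i$ such
that $\Sigma^*\lambda_i$ matches the string $a_1 \dots a_{k-1}$.
$\mathrm{SimultMatch}_{(\phi_1\rho_1)^{-1} \dots (\phi_n\rho_n)^{-1}}(w^{-1})[t-k+1]$
is the set of all rule indices $i$ such that $\phi_i\rho_i\Sigma^*$
matches the remainder $a_k \dots a_t$ of $w$ (the reversed
string $w^{-1}$ has then been read up to and including
$a_k$, i.e. $t-k+1$ symbols). Obviously,
the intersection $\mathrm{SimultMatch}_{\lambda_1 \dots \lambda_n}(w)[k-1] \cap
\mathrm{SimultMatch}_{(\phi_1\rho_1)^{-1} \dots (\phi_n\rho_n)^{-1}}(w^{-1})[t-k+1]$ is
exactly the set of all matching rules at position
$k$. The minimal element of this set is the index
of the firing rule.²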

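Now if $(\overrightarrow{A}, \overrightarrow{\tau}) := \mathrm{SimultMatch}_{\lambda_1 \dots \lambda_n}$, and
$(\overleftarrow{A}, \overleftarrow{\tau}) := \mathrm{SimultMatch}_{(\phi_1\rho_1)^{-1} \dots (\phi_n\rho_n)^{-1}}$, then
the tagging task is performed by the bimachine
$B = (\overrightarrow{A}, \overleftarrow{A}, h)$, where the output function
$h : \overrightarrow{Q} \times \Sigma \times \overleftarrow{Q} \to \Delta^*$ is defined as follows:

$h(\overrightarrow{q}, a, \overleftarrow{q}) = \psi_{\min(\overrightarrow{\tau}(\overrightarrow{q}) \,\cap\, \overleftarrow{\tau}(\overleftarrow{\delta}(\overleftarrow{q}, a)))}$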

The output function can be either precompiled
(e.g., into a hash table), or, if the resulting
table is too large, the intersection operation
can be performed at runtime, e.g. using a bitset
encoding of sets.³
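A bitset encoding makes the runtime variant cheap: the intersection is a single bitwise AND and the minimal rule index is the lowest set bit. The following sketch uses Python integers as bitsets; the function names are illustrative assumptions and say nothing about how the actual implementation encodes its sets.

```python
from typing import Iterable, Optional


def to_bitset(indices: Iterable[int]) -> int:
    """Encode a set of rule indices as an integer with bit i set for index i."""
    bits = 0
    for i in indices:
        bits |= 1 << i
    return bits


def min_common_index(left_bits: int, right_bits: int) -> Optional[int]:
    """Return the smallest index present in both bitsets, or None."""
    common = left_bits & right_bits
    if common == 0:
        return None
    # common & -common isolates the lowest set bit
    return (common & -common).bit_length() - 1


left = to_bitset({2, 3, 5})    # rules whose left context matches
right = to_bitset({3, 5, 7})   # rules whose focus and right context match
firing = min_common_index(left, right)
```

With a default rule that matches everywhere, the intersection is never empty, so the `None` branch only guards against incomplete rule sets.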

3 Extensions 

The compiler introduced in the previous section 
can be extended to handle more sophisticated 
rules and search/control strategies. 

3.1 Output Symbols in Left Contexts 

The rule formalism can be extended by including
output symbols in the left context of a rule.
This extra bit is added in the form of a regular
expression $\pi$ ranging over the output symbols,
which can be represented by rule IDs $i \in I$.
The rules then look as follows:

$R_i = \phi_i \to \psi_i / \pi_i : \lambda_i \_ \rho_i$

²In order to ensure that the above formula is always
valid, we assume that the rule with the highest index
($R_n$) matches all left and right contexts (i.e., $\lambda_n = \rho_n =
\Sigma^*$, $\phi_n = \bigcup_{\sigma \in \Sigma} \{\sigma\}$), and $\psi_n$ is a vacuous action. If none
of the other rules fire, the formalism defaults to $R_n$.

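³In the actual implementation of the tagger, $h$ has
been replaced by a function $g : \overrightarrow{Q} \times \overleftarrow{Q} \to \Delta^*$ defined as:

$g(\overrightarrow{q}, \overleftarrow{q}) = \psi_{\min(\overrightarrow{\tau}(\overrightarrow{q}) \,\cap\, \overleftarrow{\tau}(\overleftarrow{q}))}$

The translation of the $k$-th symbol in a string $w =
a_1 \dots a_t$ is then determined by the formula

$g(\overrightarrow{\delta}^*(\overrightarrow{q_0}, a_1 \dots a_{k-1}),\ \overleftarrow{\delta}^*(\overleftarrow{q_0}, a_t \dots a_k))$

which is easier to compute.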



Such a rule fires at a position $k$ in string $s_1 \dots s_t$
if an extra condition (d) holds in addition to the
conditions (a)-(c), formulated in section 2.2:

(d) The IDs $r_1 \dots r_{k-1}$ of the firing rules match $\pi_i$.

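In order to enforce condition (d), we use
the SimultMatch construct introduced in section 2.3.
For that, the patterns $\pi = \{\pi_1, \dots, \pi_n\}$
are compiled into an instance of
$\mathrm{SimultMatch}_\pi = (A_\pi, \tau_\pi)$. $A_\pi$ is an FSA, so
$A_\pi = (I, Q_\pi, q_0^\pi, \delta_\pi)$. It follows from the
construction of $\mathrm{SimultMatch}_\pi$ that the function
$\tau_\pi : Q_\pi \to 2^I$ has the following property:

$\tau_\pi(\delta_\pi^*(q_0^\pi, r_1 \dots r_k)) = \{ j \in I : r_1 \dots r_k \text{ matches } \pi_j \}$

In other words, an action $\psi_{r_k}$ is admissible at
position $k$ if $r_k \in \tau_\pi(\delta_\pi^*(q_0^\pi, r_1 \dots r_{k-1}))$. Thus,
the tagging task (according to the extended
strategy (a)-(d)) is performed by the formal
machine $M = (\overrightarrow{A}, A_\pi, \overleftarrow{A}, h)$, where $\overrightarrow{A}$ and $\overleftarrow{A}$ are
as in section 2.4, and $h : \overrightarrow{Q} \times Q_\pi \times \Sigma \times \overleftarrow{Q} \to \Delta^*$
is defined as follows:⁴

$h(\overrightarrow{q}, q^\pi, a, \overleftarrow{q}) = \psi_{r_k}$,

where

$r_k := \min(\overrightarrow{\tau}(\overrightarrow{q}) \,\cap\, \tau_\pi(q^\pi) \,\cap\, \overleftarrow{\tau}(\overleftarrow{\delta}(\overleftarrow{q}, a)))$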

In this formula, $\overrightarrow{q}$ and $\overleftarrow{q}$ are as in the basic
bimachine introduced in section 2.4. $q^\pi$ is
the state of $A_\pi$ after consuming the rule IDs
$r_1 \dots r_{k-1}$: $q^\pi := \delta_\pi^*(q_0^\pi, r_1 \dots r_{k-1})$.

In order to determine the tagging actions for
an input sequence $w = a_1 \dots a_t$, the automaton
$\overleftarrow{A}$ is first run on $w^{-1}$. Then both $\overrightarrow{A}$ and $A_\pi$ are
run on $w$ in parallel. In each step $k$, the states
$\overrightarrow{\delta}^*(\overrightarrow{q_0}, a_1 \dots a_{k-1})$, $\overleftarrow{\delta}^*(\overleftarrow{q_0}, a_t \dots a_k)$ as well as the
sequence $r_1 \dots r_{k-1}$ of already executed actions
are known, so that the $r_k$'s can be determined
incrementally from left to right.
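The incremental left-to-right determination of the firing rules can be sketched as follows. To keep the example short, the sets of rules whose left context and whose focus plus right context match at each position are assumed to be given, and the output-side patterns are checked with Python regular expressions over a history string in which each rule ID is one digit (so at most nine rules). These encodings, the function name and the toy data are assumptions for illustration and do not reflect the automaton-based implementation.

```python
import re
from typing import Dict, List, Sequence, Set


def tag_with_output_contexts(
    left_sets: Sequence[Set[int]],   # rules whose left context matches
    right_sets: Sequence[Set[int]],  # rules whose focus + right context match
    pi: Dict[int, str],              # rule ID -> regex over earlier rule IDs
) -> List[int]:
    """Return the IDs r_1 ... r_t of the firing rules, left to right."""
    history = ""                     # r_1 ... r_{k-1}, one digit per rule ID
    fired: List[int] = []
    for left, right in zip(left_sets, right_sets):
        # condition (d): the history of fired rules must match pi_j
        admissible = {j for j in left & right if re.fullmatch(pi[j], history)}
        if not admissible:
            raise ValueError("no admissible rule; a default rule is expected")
        r = min(admissible)          # ranking: the smallest ID wins
        fired.append(r)
        history += str(r)
    return fired


# Hypothetical example: rule 2 is only admissible directly after rule 1.
PI = {1: "[1-3]*", 2: "[1-3]*1", 3: "[1-3]*"}
LEFT = [{1, 2, 3}, {1, 3}, {2, 3}, {2, 3}]
RIGHT = [{2, 3}, {1, 2, 3}, {2, 3}, {2, 3}]
fired = tag_with_output_contexts(LEFT, RIGHT, PI)
```

At the first and last positions rule 2 matches the input contexts but is rejected by its output-side pattern, so the default rule 3 fires instead.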

3.2 Alternative Control Strategies 

Our rule compilation method is very flexible
with respect to control strategies. By intersecting
the sets of rule IDs $\overrightarrow{\tau}(\overrightarrow{\delta}^*(\overrightarrow{q_0}, a_1 \dots a_{k-1}))$
and $\overleftarrow{\tau}(\overleftarrow{\delta}^*(\overleftarrow{q_0}, a_t \dots a_k))$, $1 \le k \le t$, one can
determine the set of all matching rules for each
position in the input string. In the formalism
presented in section 2.4, only one rule is selected,
namely the one with the minimal ID.
This is probably the most common way of handling
rule conflicts, but the formalism does not
exclude other control strategies.

⁴In order to make sure the definition of $h$ is always
valid, we assume that the rule with the highest index ($n$)
matches all possible contexts (i.e., $\pi_n = I^*$, $\lambda_n = \rho_n =
\Sigma^*$ and $\phi_n = \bigcup_{\sigma \in \Sigma} \{\sigma\}$).

Simultaneous matching of all rules:
This strategy is particularly useful in
the machine-learning scenario, e.g. in
computing scores in transformation-based 
learning (Brill, 1995). Note that the 
simple context rules used by Brill (1995) 
may be mixed with more sophisticated 
hand-written heuristics formulated as 
regular expressions while still being sub- 
ject to scoring. As shown in section 2.2, 
taggers using unrestricted regular context 
constraints are not sequentiable, and thus 
cannot be implemented using ST-based 
rule compilation methods (Roche and 
Schabes, 1995). 

N-best/Viterbi search: Instead of a strict
ranking, the rules may be associated with
probabilities or scores such that the best sequence
of actions is picked based on global,
per-sequence, optimisation rather than on
a sequence of greedy local decisions. In order
to implement this, we can use the extended
formalism introduced in section 3.1
with a slight modification: in each step, we
keep N best-scoring paths rather than just
the one determined by the selection of the
locally optimal action $\psi_{r_k}$ for $1 \le k \le t$.
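Keeping the N best-scoring paths at each step amounts to a beam search over the sets of matching rules. The sketch below is one possible reading of that idea, not the method used in the system: it assumes that the matching rule IDs per position are already known and that a path is scored additively by a hypothetical function `score(previous_rule, rule)`.

```python
import heapq
from typing import Callable, Iterable, List, Optional, Sequence, Tuple

Path = Tuple[float, List[int]]  # (accumulated score, rule IDs so far)


def n_best(
    candidates: Sequence[Iterable[int]],           # matching rules per position
    score: Callable[[Optional[int], int], float],  # hypothetical scoring function
    n: int,
) -> List[Path]:
    """Return up to n best-scoring rule sequences, best first."""
    beam: List[Path] = [(0.0, [])]
    for cands in candidates:
        expanded = [
            (total + score(path[-1] if path else None, j), path + [j])
            for total, path in beam
            for j in sorted(cands)
        ]
        # keep only the n best partial paths
        beam = heapq.nlargest(n, expanded, key=lambda item: item[0])
    return beam


BASE = {1: -1.0, 2: -0.5, 3: -2.0}


def toy_score(prev: Optional[int], rule: int) -> float:
    # Assumed scores: rule 2 is strongly preferred right after rule 1.
    bonus = 2.0 if (prev, rule) == (1, 2) else 0.0
    return BASE[rule] + bonus


paths = n_best([{1, 2}, {2, 3}, {1, 3}], toy_score, n=2)
best_score, best_path = paths[0]
```

In this toy example a greedy choice would start with rule 2, whereas the beam keeps the path starting with rule 1 alive and finds a better overall sequence.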

4 An Application 

In this section, we describe how our bimachine 
compiler has been applied to the task of ho- 
mograph disambiguation in the rVoice speech 
synthesis system. 

Each module in the system adds information
to a structured relation graph (HRG), which
represents the input sentence or utterance to be
spoken (Taylor et al., 2001). The HRG consists
of several relations, which are structures such
as lists or trees over a set of items. The homograph
tagger works on a list relation, where
the items represent words. Each item has a feature
structure associated with it, the most relevant
features for our application being the name
feature representing the normalised word and
the pos feature representing the part-of-speech
(POS) of the word.

The assignment of POS tags is done by a 
statistical tagger (a trigram HMM). Its output 
is often sufficient to disambiguate homographs, 
but in some cases POS cannot discriminate be- 
tween two different pronunciations, as in the 
case of the word lead: they took a 1-0 lead vs. a 
lead pipe (both nouns). Furthermore, the statis- 
tical tagger turns out to be less reliable in cer- 
tain contexts. The rule-based homograph tag- 
ger is a convenient way of fixing such problems. 

The grammar of the homograph tagger con- 
sists of a set of ordered rules that define a map- 
ping from an item to a sense ID, which uniquely 
identifies the phonetic transcription of the item 
in a pronunciation lexicon. 

For better readability, we have changed the
rule syntax. Instead of $\phi \to \psi / \lambda \_ \rho$, we write:

$\lambda$ / $\phi$ / $\rho$ -> $\psi$ ;

where $\lambda$, $\phi$ and $\rho$ are regular expressions over
feature structures. The feature structures are
written $[f_1 = v_1 \dots f_k = v_k]$, where $f_i$ is a feature
name and $v_i$ an atomic value or a disjunction
of atomic values for that feature. Each
attribute-value pair constitutes a separate input
alphabet symbol. The alphabet also contains
a special default symbol that denotes feature-value
pairs not appearing in the rules. The symbol
$\psi$ stands for the action of setting the sense
feature of the item to a particular sense ID.

The following are examples of some of the
rules that disambiguate between the different
senses of suspects (sense=1 is the noun reading,
sense=2 the verb reading):

[name=that]
/ [name=suspects] /
-> [sense=2] ;

( [pos=dt|cd] | [name=terror] )
/ [name=suspects] /
-> [sense=1] ;

/ [name=suspects] /
[name=that]
-> [sense=2] ;

/ [name=suspects] / -> [sense=1] ;

Note that the last rule is a default one that 
sets sense to 1 for all instances of the word sus- 
pects where none of the other rules fire. 

To explain the interaction of the rules, we will 
look at the following example: 

the₁ terror₂ suspects₃ that₄ were₅ in₆ court₇



We can see that the second and the third rule 
match the context of word 3. The rule associ- 
ated with the lower index fires, resulting in the 
value of sense being set to 1 on the item. 
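The interaction of the four rules on the example sentence can be replayed with a small ranked-rule interpreter. The sketch below is a simplification: because the contexts in these rules each consist of a single item, it only inspects the immediate neighbours of the focus rather than compiling regular expressions. The POS tags assigned to the words, the predicate helpers and the function name are assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional

Item = Dict[str, str]
Test = Callable[[Optional[Item]], bool]


def name_is(*names: str) -> Test:
    return lambda it: it is not None and it["name"] in names


def pos_is(*tags: str) -> Test:
    return lambda it: it is not None and it["pos"] in tags


def anything(it: Optional[Item]) -> bool:
    return True  # empty context: always satisfied


def either(p: Test, q: Test) -> Test:
    return lambda it: p(it) or q(it)


# (left neighbour test, focus name, right neighbour test, sense), in rank order
RULES = [
    (name_is("that"), "suspects", anything, 2),
    (either(pos_is("dt", "cd"), name_is("terror")), "suspects", anything, 1),
    (anything, "suspects", name_is("that"), 2),
    (anything, "suspects", anything, 1),  # default rule
]


def matching_rules(items: List[Item], k: int) -> List[int]:
    """Return the 1-based indices of all rules matching item k (0-based)."""
    left = items[k - 1] if k > 0 else None
    right = items[k + 1] if k + 1 < len(items) else None
    return [i for i, (l, focus, r, _s) in enumerate(RULES, start=1)
            if items[k]["name"] == focus and l(left) and r(right)]


# POS tags below are assumed for illustration only.
WORDS = [("the", "dt"), ("terror", "nn"), ("suspects", "nns"), ("that", "wdt"),
         ("were", "vbd"), ("in", "in"), ("court", "nn")]
ITEMS = [{"name": n, "pos": p} for n, p in WORDS]

matches = matching_rules(ITEMS, 2)     # word 3, "suspects"
firing = min(matches)                  # the lowest-ranked index fires
sense = RULES[firing - 1][3]
```

Besides the second and third rule, the default rule also matches word 3, but it has the highest index and therefore never wins when another rule applies.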

5 Performance Evaluation 

To evaluate the performance of the new com- 
pilation method, we measured the compilation 
time and the size of the resulting structures 
for a set of homograph disambiguation rules in 
the format described in section 4. The results 
were compared to the results achieved using a 
compiler that converts each rule into an FST 
and then composes the FSTs and determinises 
the transducer created by composition (Skut et 
al., 2004). Both algorithms were implemented 
in C++ using the same library of FST/FSA 
classes, so the results solely reflect the differ- 
ence between the algorithms. 

Figures (1a)-(1c) show the results
of running both implementations on a Pentium 4
1.7 GHz processor for rule sets of different
sizes. Figure (1a) shows the number of
states, (1b) the number of transitions, and (1c)
the compilation time. The numbers of states
and transitions for $A_2$, the bimachine-based approach
proposed in this paper, are the sums
of the states and transitions, respectively, for
the left-to-right and right-to-left acceptors. The
left-to-right FSA typically has a much smaller
number of states and transitions than the
right-to-left FSA (only 10% of its states and 2-5%
of its transitions) since it does not contain the
regular expression for the rule focus.

While the figures show a substantial decrease
in runtime for the bimachine construction
method ($A_2$) compared to the FST-based
approach $A_1$ (only 6.48 seconds instead of
115.29 seconds for the largest set of 40 rules
in (1c)), the numbers of states and transitions
are slightly larger for the bimachine. Typically
the FSAs have about 25% more states and 35%
more transitions than the corresponding STs in
our test set. However, an FSA takes up less
memory than an FST as there are no emissions
associated with transitions and the output function
$h$ can be encoded in a very space-efficient
way. As a result, the size of the compiled structure
in RAM was down by almost 30% compared
to the size of the original transducer.

Figure 1: Comparison of two compilation algorithms.
$A_1$ is a transducer-based construction
method (Skut et al., 2004), $A_2$ the approach
proposed in this paper. The panels show the
number of states $S$ (1a), the number of transitions
$Tr$ (1b) and the runtime $t$ in seconds (1c) depending on
the size of the input rule file ($n$ is the number
of rules, ranging from 5 to 40).

6 Conclusion

The rule compiler described in this paper
presents an attractive alternative to compilation
methods that use FST composition and
complement in order to convert rewrite rules
into finite-state transducers. The direct combination
of context patterns into an acceptor
with final outputs makes it possible to avoid the
use of relatively costly FST algorithms. In the
present implementation, the only potentially expensive
routine is the creation of the deterministic
acceptors for the context patterns. However,
if the task is to create a deterministic device,
determinisation (in its more expensive version
for FSTs) is also required in the FST-based
approaches (Skut et al., 2004). The experimental
results presented in section 5 show that
compilation speed is not a problem in practice.
Should it become an issue, there is still room for
optimisation. The potential bottleneck due to
DFSA determinisation can be eliminated if we
use a generalisation of the Aho-Corasick string
matching algorithm (Aho and Corasick, 1975)
in order to construct the deterministic acceptor
for the language $\Sigma^* \bigcup_{j \in I} \beta_j \$_j$ while creating the
SimultMatch construct (Mohri, 1997b).

By constructing a single deterministic device,
we pursue a strategy similar to the compilation
algorithms described by Laporte (1997)
and Hetherington (2001). Our method shares
some of their properties such as the restriction
of the rule focus $\phi$ to one input symbol.⁵
However, it is more powerful as it allows unrestricted
(also cyclic) regular expressions in both
the left and the right rule context. The practical
significance of this extra feature is substantial:
unlike phonological rewrite rules (the topic
of both Laporte's and Hetherington's work),
homograph disambiguation does involve inspecting
non-local contexts, which often pose a difficulty
to the 3-gram HMM tagger used to assign
POS tags in our system.

Although the use of our compiler is currently 
restricted to hand-written rules, the extensions 
sketched in section 3.2 make it possible to use it 
in a machine learning scenario (for both training 
and run-time application). 

Our rule compiler has been applied success- 
fully to a range of tasks in the domain of 
speech synthesis, including homograph resolu- 
tion, post-lexical processing and phrase break 
prediction. In all these applications, it has 
proved to be a useful and reliable tool for the 
development of large rule systems. 

⁵As pointed out in section 2.2, this restriction does
not pose a problem as the compiler is primarily designed
for rule-based tagging.